Natural Language Engineering Techniques
Lecture 7
The discourse level: Rhetorical Structure Theory and Veins Theory (cohesion)

Lecture: Dan Cristea
Labs: Diana Trandabăț, Mihaela Onofrei, Daniela Gîfu, Ionuț Pistol

Overview
• Rhetorical Structure Theory
• Linguistic motivations for Veins Theory
• Basics of Veins Theory (VT)
• Conjecture 1: cohesion

Rhetorical Structure Theory

Rhetorical Structure Theory (William Mann and Sandra Thompson, 1987)
Basics
• text span: an uninterrupted linear interval of text
• relation: holds between two or more non-overlapping spans
• arguments of relations are of a nuclear (N) type or a satellite (S) type
– a nucleus is more important than a satellite (deletion and substitution tests)
– relations: hypotactic (one nucleus + satellites) and paratactic (all nuclear)
• schema: integrates two or more text spans by a relation (like grammar rules)
• RST analyses are trees
• they reflect a judge's interpretation (and can therefore be subjective)

RST: a relation definition
EVIDENCE
constraint on N: Reader (R) might not believe N to a degree satisfactory to Writer (W)
constraint on S: R believes S or finds it credible
effect: R's belief of N is increased

CONCESSION relation
CONCESSION
constraint on N: W has a positive regard for the situation presented in N
constraint on S: W is not claiming that the situation presented in S doesn't hold
constraint on the combination N+S: W acknowledges a potential incompatibility between the situations presented in N and S; W regards the situations presented in N and S as compatible
effect: R's positive regard for the situation presented in N is increased

CIRCUMSTANCE relation
CIRCUMSTANCE
constraint on N: none
constraint on S: S presents a situation
constraint on the combination N+S: S sets a framework (spatial or temporal) within which R is intended to interpret the situation presented in N
effect: R recognizes that the situation presented in S provides the framework for interpreting N

RST relations
• Subject matter (informational): Elaboration, Circumstance, Solutionhood, Volitional Cause, Volitional Result, Non-Volitional Cause, Non-Volitional Result, Purpose, Condition, Otherwise, Interpretation, Evaluation, Restatement, Summary, Sequence, Contrast
• Presentational (intentional): Motivation, Antithesis, Background, Enablement, Evidence, Justify, Concession

A more complex example
a Jack and Sue went to buy a new lawn mower
b because their old one had been stolen.
c Sue had seen the men who took it
d and had followed them for a while down the street,
e but they disappeared in a truck.
f After looking in a store,
g they realized they could not afford a new one.
h By the way, Jack lost his job last month
i so he was in a difficult financial situation at the moment.
j He had tried to find another one
k but so far had not had much luck.
l They eventually found a second-hand one for sale in a garage.

Anaphoric references: some at distance
The same text (units a to l), viewed for its anaphoric references: some antecedents lie several units back.

RST related readings
Mann, W. and Thompson, S. (1987): Rhetorical Structure Theory.
Marcu, D. (2000): The Theory and Practice of Discourse Parsing and Summarization. The MIT Press.

Linguistic evidence for Veins Theory

Are you comfortable with finding a referent for "it" in unit 3?
1 With one year before finishing his mandate as president of the company,
2 Mr. W. Ross has begun to bring about its bankruptcy.
3 There were rumors that he has obtained it by fraud.
With the judgment marked:
*3 There were rumors that he has obtained it by fraud.

How about now?
1 Mr. W. Ross has begun to bring about the bankruptcy of his company
2 with one year before finishing his mandate as president.
3 There were rumors that he has obtained it by fraud.

Who is "she" in unit 4?
1 John told Mary that he loves her.
2 He has never been married
3 and lived until his 40s with his mother.
4 She, on the contrary, was married twice.

Veins Theory – basics

Fundamental assumption in VT (the cohesion claim)
• An inter-unit reference is possible only if the two units are in a structural relation with each other
• The nucleus-satellite distinction, as a component of the discourse structure, indicates the range of referents to which an anaphor can be resolved

The definitions of VT: heads
• Head expression of a node: the sequence of the most important units within the corresponding span of text:
– the head of a terminal node: itself (its label)
– the head of a non-terminal node: the concatenation of the head expressions of the nuclear daughters
• The important units are projected up to the level where the corresponding span is seen as a satellite

The account of VT on cohesion
References from a given unit are possible mainly in its domain of accessibility. In particular:
– (1) In most cases, if B is a unit and b∈B is a referential expression, then either b directly realizes a center that appears for the first time in the discourse, or it refers back to another center realized by a referential expression a∈A, such that A∈acc(B). Such references are called direct references.
– (2) If (1) is not applicable, then if A, B, and C are units, c∈C is a referential expression that refers to b∈B, and B is not on the vein of C (i.e., it is not visible from C), then there is an item a∈A, where A is a unit on the common vein of B and C, such that both b and c refer to a. In this case we say that c is an indirect reference to a.
– (3) If neither (1) nor (2) is applicable, then the reference in C can be understood without the referee, as if the corresponding entity were introduced in the discourse for the first time. Such references are inferential references.

Consider the text:
1 John sold his bicycle
2 although Bill would have wanted it.
3 He obtained a good price for it,
4 which Bill could not have afforded.
5 Therefore he decided to use the money to go on a trip.

Jack & Sue's binary RST structure
[Figure: binary RST tree for the Jack and Sue text, units a to l]

Anaphoric references: some at distance
Each unit is listed with its vein (V) and its DEA (domain of accessibility):
a  V=agl      DEA=a      Jack and Sue went to buy a new lawn mower
b  V=abgl     DEA=ab     because their old one had been stolen.
c  V=abcdegl  DEA=abc    Sue had seen the men who took it
d  V=abcdegl  DEA=abcd   and had followed them for a while down the street,
e  V=abcdegl  DEA=abcde  but they disappeared in a truck.
f  V=afgl     DEA=af     After looking in a store,
g  V=a(f)gl   DEA=afg    they realized they could not afford a new one.
h  V=aghl     DEA=agh    By the way, Jack lost his job last month
i  V=aghil    DEA=aghi   so he was in a difficult financial situation at the moment.
j  V=aghjkl   DEA=aghj   He had tried to find another one
k  V=aghjkl   DEA=aghjk  but so far had not had much luck.
l  V=agl      DEA=agl    They eventually found a second-hand one for sale in a garage.

Experiment 2: potential to establish correct co-reference links
• Compare the Linear-k and Discourse-VT-k models:
– for each k, each referring expression re, and each model M (Linear or VT)
Potentials:
$$p(M\text{-}k, re, DEA_k) = \begin{cases} 1, & \text{if } re \text{ can be resolved to an antecedent in } DEA_k \\ 0, & \text{otherwise} \end{cases}$$
$$p(M\text{-}k, \mathit{Corpus}) = \sum_{re \in \mathit{Corpus}} p(M\text{-}k, re, DEA_k)$$

Experiment 3: the effort required to find antecedents
• Compare the Linear-k and Discourse-VT-k models:
– for each k, each referring expression re, and each model M (Linear or VT)
Efforts:
$$e(M\text{-}k, re, DEA_k) = \begin{cases} d < k, & \text{the distance between } re \text{ and the closest antecedent in } DEA_k \\ k, & \text{if no such antecedent exists} \end{cases}$$
$$e(M\text{-}k, \mathit{Corpus}) = \sum_{re \in \mathit{Corpus}} e(M\text{-}k, re, DEA_k)$$
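
The potential and effort measures of Experiments 2 and 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the setup used in the experiments. The DEA strings are copied from the Jack and Sue table. The function names, the reference annotations in `REFS`, and the way distance is counted are assumptions made for this example: candidate units are visited nearest first, the unit containing the referring expression is excluded, and d is the 0-based position of the first unit holding an antecedent.

```python
# Units of the Jack and Sue text, in linear order.
UNITS = "abcdefghijkl"

# DEA of each unit, copied from the vein/DEA table of the example.
DEA = {
    "a": "a", "b": "ab", "c": "abc", "d": "abcd", "e": "abcde",
    "f": "af", "g": "afg", "h": "agh", "i": "aghi",
    "j": "aghj", "k": "aghjk", "l": "agl",
}

def linear_domain(unit, k):
    """Linear-k: the k units immediately preceding `unit`, nearest first."""
    i = UNITS.index(unit)
    return list(reversed(UNITS[max(0, i - k):i]))

def vt_domain(unit, k):
    """Discourse-VT-k: the k nearest preceding units on the unit's DEA."""
    # Assumption: the unit itself (last symbol of its DEA) is not a candidate.
    return list(reversed(DEA[unit][:-1]))[:k]

def potential(domain, antecedents):
    """1 if some unit in the domain holds an antecedent, else 0."""
    return int(any(u in antecedents for u in domain))

def effort(domain, antecedents, k):
    """Distance d < k to the closest antecedent, or k if none is found."""
    for d, u in enumerate(domain):
        if u in antecedents:
            return d
    return k

def corpus_scores(refs, domain_fn, k):
    """Sum potential and effort over all referring expressions."""
    p = sum(potential(domain_fn(u, k), ants) for u, ants in refs)
    e = sum(effort(domain_fn(u, k), ants, k) for u, ants in refs)
    return p, e

# Hypothetical annotations: (unit of the referring expression,
# units that contain an antecedent). Illustrative only, not corpus data.
REFS = [
    ("b", {"a"}),       # "their" -> Jack and Sue
    ("c", {"b"}),       # "it" -> the old one
    ("e", {"c", "d"}),  # "they" -> the men
    ("g", {"a", "b"}),  # "they" -> Jack and Sue
    ("j", {"h"}),       # "another one" -> his job
    ("l", {"a", "g"}),  # "They" -> Jack and Sue
]

if __name__ == "__main__":
    for name, fn in (("Linear", linear_domain), ("VT", vt_domain)):
        print(name, corpus_scores(REFS, fn, k=2))
```

With these toy annotations and k=2, the linear window misses the antecedents of the references in g and l, while the DEA-based window still reaches them through unit a or g. A higher total potential and a lower total effort are the kind of difference the two experiments are designed to measure.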